88 research outputs found

Coresets-Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms

Authors: Munteanu Alexander; Schwiegelshohn Chris
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

We present a technical survey of state-of-the-art approaches in data reduction and the coreset framework. These include geometric decompositions, gradient methods, random sampling, sketching and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower-bounding techniques.

Archivio della ricerca- Università di Roma La Sapienza

Probabilistic Smallest Enclosing Ball in High Dimensions via Subgradient Sampling

Authors: Krivosija Amer; Munteanu Alexander
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 35th International Symposium on Computational Geometry (SoCG 2019)
Publication date: 01/01/2019

We study a variant of the median problem for a collection of point sets in high dimensions. This generalizes the geometric median as well as the (probabilistic) smallest enclosing ball (pSEB) problems. Our main objective and motivation is to improve the previously best algorithm for the pSEB problem by reducing its exponential dependence on the dimension to linear. This is achieved via a novel combination of sampling techniques for clustering problems in metric spaces with the framework of stochastic subgradient descent. As a result, the algorithm becomes applicable to shape fitting problems in Hilbert spaces of unbounded dimension via kernel functions. We present an exemplary application by extending the support vector data description (SVDD) shape fitting method to the probabilistic case. This is done by simulating the pSEB algorithm implicitly in the feature space induced by the kernel function.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Random projections for Bayesian regression

Authors: Geppert Leo N.; Ickstadt Katja; Munteanu Alexander; Quedenfeld Jens; Sohler Christian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/11/2015

This article deals with random projections applied as a data reduction technique for Bayesian regression analysis. We show sufficient conditions under which the entire $d$-dimensional distribution is approximately preserved under random projections by reducing the number of data points from $n$ to $k\in O(\operatorname{poly}(d/\varepsilon))$ in the case $n\gg d$. Under mild assumptions, we prove that evaluating a Gaussian likelihood function based on the projected data instead of the original data yields a $(1+O(\varepsilon))$-approximation in terms of the $\ell_2$ Wasserstein distance. Our main result shows that the posterior distribution of Bayesian linear regression is approximated up to a small error depending on only an $\varepsilon$-fraction of its defining parameters. This holds when using arbitrary Gaussian priors or the degenerate case of uniform distributions over $\mathbb{R}^d$ for $\beta$. Our empirical evaluations involve different simulated settings of Bayesian linear regression. Our experiments underline that the proposed method is able to recover the regression model up to small error while considerably reducing the total running time.
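The data reduction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a dense Gaussian projection matrix, a two-parameter model (intercept and slope), and a flat prior, under which the posterior mean for a Gaussian likelihood coincides with the least-squares solution. All function and variable names are hypothetical.

```python
import random

def solve_2x2_least_squares(rows, targets):
    """Solve min ||A x - b||_2 for a two-column A via the normal equations."""
    s00 = sum(r[0] * r[0] for r in rows)
    s01 = sum(r[0] * r[1] for r in rows)
    s11 = sum(r[1] * r[1] for r in rows)
    t0 = sum(r[0] * y for r, y in zip(rows, targets))
    t1 = sum(r[1] * y for r, y in zip(rows, targets))
    det = s00 * s11 - s01 * s01
    # Cramer's rule for the 2x2 system (A^T A) x = A^T b.
    return ((s11 * t0 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det)

def gaussian_sketch(rows, targets, k, rng):
    """Compress n data rows to k rows with a dense Gaussian matrix S (k x n).

    Assumption: entries of S are i.i.d. N(0, 1/k); other embeddings are possible.
    """
    n, d = len(rows), len(rows[0])
    scale = 1.0 / k ** 0.5
    sketched_rows, sketched_targets = [], []
    for _ in range(k):
        s = [rng.gauss(0.0, scale) for _ in range(n)]  # one row of S
        sketched_rows.append(
            [sum(si * r[j] for si, r in zip(s, rows)) for j in range(d)]
        )
        sketched_targets.append(sum(si * y for si, y in zip(s, targets)))
    return sketched_rows, sketched_targets

rng = random.Random(0)
n, k = 2000, 200
xs = [rng.random() for _ in range(n)]
rows = [[1.0, x] for x in xs]                               # design matrix A
targets = [1.0 + 2.0 * x + rng.gauss(0.0, 0.1) for x in xs]  # b = A beta + noise

full_fit = solve_2x2_least_squares(rows, targets)
sk_rows, sk_targets = gaussian_sketch(rows, targets, k, rng)
sketch_fit = solve_2x2_least_squares(sk_rows, sk_targets)
print("full data:", full_fit)
print("sketched :", sketch_fit)
```

The regression is solved on 200 projected rows instead of 2000 original ones, and the two coefficient estimates should agree up to a small error. A dense projection like this one costs as much as it saves; in practice one would likely choose a faster embedding.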

arXiv.org e-Print Archive

Springer - Publisher Connector

On large-scale probabilistic and statistical data analysis

Author: Munteanu Alexander

In this manuscript we develop and apply modern algorithmic data reduction techniques to tackle scalability issues and enable statistical data analysis of massive data sets. Our algorithms follow a general scheme, where a reduction technique is applied to the large-scale data to obtain a small summary of sublinear size to which a classical algorithm is applied. The techniques for obtaining these summaries depend on the problem that we want to solve. The size of the summaries is usually parametrized by an approximation parameter, expressing the trade-off between efficiency and accuracy. In some cases the data can be reduced to a size that has no or only negligible dependency on the initial number of data items. However, for other problems it turns out that sublinear summaries do not exist in the worst case. In such situations, we exploit statistical or geometric relaxations to obtain useful sublinear summaries under certain mildness assumptions. We present, in particular, the data reduction methods called coresets and subspace embeddings, and several algorithmic techniques to construct these via random projections and sampling.
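The general scheme in this abstract (reduce the data to a small weighted summary, then run a classical computation on the summary) can be sketched as follows. This is an illustration under strong assumptions, not the manuscript's construction: it uses plain uniform sampling, which carries no worst-case guarantee, on one-dimensional data with a squared-distance cost. All names are hypothetical.

```python
import random

def clustering_cost(points, weights, center):
    """Weighted sum of squared distances from 1-D points to a single center."""
    return sum(w * (p - center) ** 2 for p, w in zip(points, weights))

def uniform_summary(points, m, rng):
    """Draw m points uniformly with replacement and weight each by n/m,
    so the weighted cost is an unbiased estimate of the full cost."""
    n = len(points)
    sample = [rng.choice(points) for _ in range(m)]
    return sample, [n / m] * m

rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
summary, summary_weights = uniform_summary(data, 1_000, rng)

center = 0.5  # an arbitrary query center
full_cost = clustering_cost(data, [1.0] * len(data), center)
approx_cost = clustering_cost(summary, summary_weights, center)
relative_error = abs(approx_cost - full_cost) / full_cost
print(f"full={full_cost:.1f} summary={approx_cost:.1f} rel.err={relative_error:.3f}")
```

The sample size plays the role of the approximation parameter mentioned in the abstract: a larger summary costs more to process but tracks the full cost more closely. Importance-sampling schemes that guard against outliers would replace `uniform_summary` here.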

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung

Glow Discharge Optical Emission Spectrometry (GDOES), an Effectiveness Method for Characterizing Composition of Surfaces and Coatings

Authors: Alexander SCHREINER; Daniel MUNTEANU
Publication venue: Galati University Press
Publication date: 01/11/2008

This work presents the technical procedures and practical advantages of using Glow Discharge Optical Emission Spectroscopy (GDOES) to establish depth concentration profiles of surfaces. GDOES can detect low concentrations with high accuracy. It can be used for either quantitative bulk analysis (QBA) or quantitative depth profiling (QDP) in the nanometer to micron range. Both non-conductive and conductive samples can be analysed. The main applications of this spectral method lie in technology fields such as heat treatment processes, casting, hot and cold forming processes, thermochemical treatments, electrochemical processes (galvanic coatings), chemical and physical vapour deposition (CVD, PVD), thermal oxidation processes and anodizing, thin films, and others.

Directory of Open Access Journals

Optimal Sketching Bounds for Sparse Linear Regression

Authors: Mai Tung; Munteanu Alexander; Musco Cameron; Rao Anup B.; Schwiegelshohn Chris; Woodruff David P.
Publication date: 05/04/2023

We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions, which includes the logistic and ReLU losses. We show that for sparse $\ell_2$ norm regression, there is a distribution over oblivious sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor. This extends to $\ell_p$ loss with an additional additive $O(k\log(k/\varepsilon)/\varepsilon^2)$ term in the upper bound. This establishes a surprising separation from the related sparse recovery problem, which is an important special case of sparse regression. For this problem, under the $\ell_2$ norm, we observe an upper bound of $O(k \log (d)/\varepsilon + k\log(k/\varepsilon)/\varepsilon^2)$ rows, showing that sparse recovery is strictly easier to sketch than sparse regression. For sparse regression under hinge-like loss functions including sparse logistic and sparse ReLU regression, we give the first known sketching bounds that achieve $o(d)$ rows showing that $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ rows suffice, where $\mu$ is a natural complexity parameter needed to obtain relative error bounds for these loss functions. We again show that this dimension is tight, up to lower order terms and the dependence on $\mu$. Finally, we show that similar sketching bounds can be achieved for LASSO regression, a popular convex relaxation of sparse regression, where one aims to minimize $\|Ax-b\|_2^2+\lambda\|x\|_1$ over $x\in\mathbb{R}^d$. We show that sketching dimension $O(\log(d)/(\lambda \varepsilon)^2)$ suffices and that the dependence on $d$ and $\lambda$ is tight.

Comment: AISTATS 202

arXiv.org e-Print Archive

Cold War spy satellite images reveal long-term declines of a philopatric keystone species in response to cropland expansion

Authors: Kamp Johannes; Klein Nadja; Koshkina Alyona; Kraemer Benjamin M.; Kuemmerle Tobias; Munteanu Catalina; Müller Daniel; Nita Mihai Daniel; Prishchepov Alexander V.
Publication venue: 'The Royal Society'
Publication date: 27/05/2020

Agricultural expansion drives biodiversity loss globally, but impact assessments are biased towards recent time periods. This can lead to a gross underestimation of species declines in response to habitat loss, especially when species declines are gradual and occur over long time periods. Using Cold War spy satellite images (Corona), we show that a grassland keystone species, the bobak marmot (Marmota bobak), continues to respond to agricultural expansion that happened more than 50 years ago. Although burrow densities of the bobak marmot today are highest in croplands, densities declined most strongly in areas that were persistently used as croplands since the 1960s. This response to historical agricultural conversion spans roughly eight marmot generations and suggests the longest recorded response of a mammal species to agricultural expansion. We also found evidence for remarkable philopatry: nearly half of all burrows retained their exact location since the 1960s, and this was most pronounced in grasslands. Our results stress the need for farsighted decisions, because contemporary land management will affect biodiversity decades into the future. Finally, our work pioneers the use of Corona historical Cold War spy satellite imagery for ecology. This vastly underused global remote sensing resource provides a unique opportunity to expand the time horizon of broad-scale ecological studies.

Copenhagen University Research Information System

University of Melbourne Institutional Repository